UCalgary Math Camp: Probability
Last updated for August 2024
Len Goff
## What is "econometrics"? ---
## What is "econometrics"? ---
## What is "econometrics"? ---
## What is "econometrics"? ---
## What is "econometrics"? ---
## Today --- - Reference for today's material: "Statistics for Econometrics" notes on my website - Chapters 1 and 2 for content on probability - Chapters 3.3 and 4 for the law of large numbers and central limit theorem - For a real textbook: - Hansen, *Probability and Statistics for Economists*: https://www.ssc.wisc.edu/~bhansen/probability/ - Other useful references: - Casella \& Berger, *Statistical Inference* - Hansen, *Econometrics*: https://www.ssc.wisc.edu/~bhansen/econometrics/ - For background on probability theory: Rosenthal, *A First Look at Rigorous Probability Theory*
Section 1: Probability & random variables
## Introduction --- Economics is about people and facts--so we need a language to discuss facts about *groups* of people. **Example:** what is the average annual household income in the United States? How does it compare to that of Mexico? **Example:** how likely is it that a child born in Alberta in 2003 is in college today?
The mathematical notion of *probability* allows us to talk about these things. *Statistics* applies probability theory to analyze data, helping us answer questions like: how good is our data at determining the desired fact?
## Outline of this section - Probability - Random variables - Expectation - Conditional probability and expectation
## What is probability? --- It's easiest to start with a mathematical definition: probability is a function that associates a *number* to each of a set of *events*.
**Example:** rolling a six-sided die
## What is an event? --- Start with a set of possible *outcomes*, called the *sample space* $\Omega$. *Events* are just sets of outcomes, for example: - the event that I roll a three - the event that I roll an even number - the event that I roll *any* number
## Enumerating all of the possible events --- Begin with the sample space
$\Omega$
of mutually-exclusive outcomes:
$\color{blue}{\Omega} = \{1,2,3,4,5,6\}$ An **event**
$A$
is any subset of $\color{blue}\Omega$, e.g. * rolling a three: $\color{green}{A} = \\{3\\}$ * rolling an even number: $\color{green}{A} = \\{2,4,6\\}$ * rolling any number: $\color{green}{A} = \\{1,2,3,4,5,6\\} = \color{blue}{\Omega}$
How many distinct events are there for this sample space?
$\quad 2^6 = 64$ (each of the six outcomes is either in the event or not)
## Associating probabilities with events: discrete case --- When rolling a so-called "fair die", each individual outcome $\color{blue}{\omega} \in \Omega$ has an equal probability: $P(\omega) = \color{orange}{1/6}$. To get the probability of an event $\color{green}{A}$, simply add this for each $\color{blue}{\omega} \in \color{green}{A}$.
- $\color{green}{A} = \\{3\\}, \quad P(\color{green}{A})= \color{orange}{1/6} \times 1 = 1/6$
- $\color{green}{A} = \\{2,4,6\\}, \quad P(\color{green}{A})= \color{orange}{1/6} \times 3 = 1/2$
- $\color{green}{A} = \\{1,2,3,4,5,6\\}, \quad P(\color{green}{A})= \color{orange}{1/6} \times 6 = 1$
## Associating probabilities with events: continuous case - Now imagine a sample space $\Omega$ that consists of any real number between 0 and 1 - e.g. throwing a dart at a one-dimensional target - Suppose we want to construct a probability function $P$ that puts equal probability on each such number in $[0,1]$ - Under such a distribution, what is the probability associated with a *single* point, e.g. $P(\\{0.3\\})$?
$\quad 0!$
But now we have a puzzle: if $P(\\{x\\}) = 0$ for each $x \in [0,1]$, how can we have $P([0,1]) = 1$?
A solution comes from the notion of a *probability space*.
## Associating probabilites with events: Kolmogorov's axioms
**Definition:** A **probability space** is a triple $(\color{blue}{\Omega}, \color{gray}{F}, \color{orange}{P})$ where
- $\color{blue}{\Omega}$ is the "sample space" (a.k.a. "outcome space"; the set of possible outcomes) - $\color{gray}{F}$ is a collection of events (subsets of $\color{blue}{\Omega}$) with the structure of a $\sigma$*-algebra* (next slide), and - $\color{orange}{P}$ is a function from $\color{gray}{F}$ to the real numbers
where $\color{orange}{P}$ is such that:
- For any $\color{green}{A} \in \color{gray}{F}, \quad \color{orange}{P}(\color{green}{A}) \in \mathbb{R} \textrm{ and } \color{orange}{P}(\color{green}{A}) \ge 0$ - $\color{orange}{P}(\color{blue}{\Omega}) = 1$ - For any *countable* collection of disjoint sets $\color{green}{A_1, A_2, \dots}$ where each $\color{green}{A_i} \in \color{gray}{F}$: $$\color{orange}{P}\left(\color{green}{\bigcup_{i} A_i}\right) = \sum_i \color{orange}{P}(\color{green}{A_i})$$ (press down arrow for a review of set notation)
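To make the axioms concrete, here is a minimal Python sketch (my own illustration, not from the notes) of the fair-die probability space, taking $\color{gray}{F}$ to be the powerset of $\color{blue}{\Omega}$ and checking the three axioms on a pair of disjoint events:

```python
# Fair-die probability space: Omega = {1,...,6}, F = all subsets, P(A) = |A|/6.
from fractions import Fraction

omega = frozenset({1, 2, 3, 4, 5, 6})   # sample space

def P(A):
    """Probability of an event A (any subset of omega)."""
    return Fraction(len(A), len(omega))

A = {2, 4, 6}                    # the event "roll an even number"
B = {1}                          # an event disjoint from A

assert P(A) >= 0                 # axiom 1: non-negativity
assert P(omega) == 1             # axiom 2: P(Omega) = 1
assert A.isdisjoint(B)           # A and B are disjoint, so...
assert P(A | B) == P(A) + P(B)   # axiom 3: additivity (finite case)
```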
## Sets - A set is just a collection or group of items, e.g. the set of numbers 1, 2, and 5 is denoted: $\\{1,2,5\\}$. Order is not meaningful, so $\\{1,2,5\\} = \\{2,1,5\\}$, etc. - The set $\\{\\}$ containing no elements is called the null set and is denoted $\emptyset$ - A set containing a single element, e.g. $\\{2\\}$ is called a *singleton*. - The *union* of two sets $A$ and $B$ is denoted $A \cup B$, and is the set of all elements that belong to either $A$ or $B$, e.g. $\\{1,2,5\\} \cup \\{2,9\\} = \\{1,2,5,9\\}$ - The *intersection* of two sets $A$ and $B$ is denoted $A \cap B$, and is the set of all elements that belong to both $A$ and $B$, e.g. $\\{1,2,5\\} \cap \\{2,9\\} = \\{2\\}$ - Sets are called *disjoint* if they have no elements in common, i.e. $A \cap B = \emptyset$ - A set of sets (which I'll often refer to as a *collection* of sets) is a set in which each element is itself a set, e.g. $X = \\{ \\{1,2\\}, \\{3\\}, \emptyset \\}$ - We can extend the notions of union and intersection to collections of sets, for example $\bigcup_{x \in X} x = \\{1,2,3\\}$ and $\bigcap_{x \in X} x = \emptyset$
## More on sets - We say that $A \subseteq B$ when all elements in $A$ are also in $B$, e.g. $\\{1,2\\} \subseteq \\{1,2,3\\}$ - When $A = \\{a\\}$ is a singleton, we can use the membership notation $a \in B$, e.g. $3 \in \\{1,2,3\\}$ - When $A \subseteq B$ and $B$ contains at least one element that isn't in $A$, then we say $A \subset B$, e.g. $\\{1,2\\} \subseteq \\{1,2\\}$ but $\\{1,2\\} \subset \\{1,2,3\\}$ - When $A \subseteq B$, the *complement* of set $A$ in set $B$ is the set of all elements of $B$ that are not in $A$, e.g. if $A = \\{1,2\\}$ and $B=\\{1,2,3\\}$, then the complement of $A$ in $B$ is $\\{3\\}$ - Sometimes $B$ will simply be implicit, taken to be the so-called "universe" of possible elements under consideration. In this case, we may simply speak of the "complement of $A$" and denote it as $A^c$.
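As an aside, these set operations map directly onto Python's built-in `set` type; a quick illustration:

```python
A = {1, 2, 5}
B = {2, 9}

print(A | B)                  # union: {1, 2, 5, 9}
print(A & B)                  # intersection: {2}
print(A.isdisjoint(B))        # False: they share the element 2
print({1, 2} <= {1, 2, 3})    # subset (A ⊆ B): True
print({1, 2} < {1, 2})        # proper subset (A ⊂ B): False
print(3 in {1, 2, 3})         # element membership: True
```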
## What is a $\sigma$*-algebra*? Recall, $\color{gray}{F}$ is a collection of subsets of $\color{blue}{\Omega}$. But it doesn't need to include all of them.
The elements $\color{green}{A} \in \color{gray}{F}$ are referred to as *measurable sets* or "events". These are the only sets to which we associate probabilities $\color{orange}{P}(\color{green}{A})$. **Definition:** To be a $\mathbf{\sigma}$**-algebra**, $F$ must satisfy the following properties: - $\color{blue}{\Omega}$ is contained in $\color{gray}{F}$ - If $\color{green}{A} \subseteq \color{blue}{\Omega}$ is in $\color{gray}{F}$, then so is the complement of $\color{green}{A}$ (its complement with respect to $\color{blue}{\Omega}$) - If a countable collection $\color{green}{A_1}, \color{green}{A_2}, \dots$ are each in $\color{gray}{F}$, then $\bigcup_{i} \color{green}{A_i}$ is in $\color{gray}{F}$
For example: the collection of the two sets $\Omega$ and $\emptyset$ is always a $\sigma$-algebra.
More useful examples: - The powerset of $\color{blue}{\Omega}$ is always a $\sigma$*-algebra*, but it might be "too big" (see next slide). - The standard "Borel" $\sigma$*-algebra* for $\color{blue}{\Omega} = [0,1]$ starts with all *open intervals* in $\color{blue}{\Omega}$.
## Why is the notion of a $\sigma$-algebra necessary? - Keeping some of the subsets of $\color{blue}{\Omega}$ out of $\color{gray}{F}$ avoids technical complications that can arise when $\color{blue}{\Omega}$ is uncountably infinite. - For example, it can be proven that there exists no function $\color{orange}{P}$ defined on *all* subsets $\color{green}{A} \subseteq [0,1]$ satisfying both of the following properties in addition to countable additivity (press down for a sketch of the proof): - $\color{orange}{P}\left(\color{green}{[a,b]}\right) = \color{orange}{P}\left(\color{green}{[a,b)}\right) = \color{orange}{P}\left(\color{green}{(a,b]}\right) = \color{orange}{P}\left(\color{green}{(a,b)}\right)=b-a$ - Translational invariance: $\color{orange}{P}\left(\color{green}{A}\right) = \color{orange}{P}\left(\color{green}{A \bigoplus r}\right)$ for all $r$, where $\color{green}{A \bigoplus r}$ increases each element of $\color{green}{A}$ by $r$ (wrapping around if needed) - Both of the above are properties we'd expect of the uniform distribution on $[0,1]$. Therefore, to define the uniform distribution we must leave $\color{orange}{P}\left(\color{green}{A}\right)$ undefined for some events $\color{green}{A} \subseteq [0,1]$. - The events $\color{green}{A}$ that we take out are given the name *non-measurable sets*. - This is why we need the concept of a $\sigma$-algebra. But in practice, understanding these technicalities usually isn't important for applied econometrics.
## Sketch of proof (see Rosenthal 2006 for details) There exists a set $H$ with the following properties (*proof omitted*) - For any rational numbers $r \ne r'$: $(H \bigoplus r) \cap (H \bigoplus r') = \emptyset$ - Shifts of $H$ by rational $r$ cover the unit interval: $\bigcup_{r \in \mathbb{Q}} (H \bigoplus r) = (0,1]$ By Kolmogorov's axioms, then: $P((0,1]) = P\left(\bigcup_{r \in \mathbb{Q}} (H \bigoplus r)\right) = \sum_{r \in \mathbb{Q}} P\left(H \bigoplus r\right)$ By the translational invariance property: $P\left(H \bigoplus r\right) = P(H)$ for all $r$ We've also assumed that $P((0,1])=P([0,1])$ and we need $P([0,1])=1$. Thus: $\sum_{r \in \mathbb{Q}} P\left(H\right) = 1$. There is no value of $P(H)$ that could make this true: the sum is $0$ if $P(H)=0$ and $\infty$ if $P(H)>0$.
## Back to rolling dice Suppose we have *two* fair dice. Now our sample space is $\Omega = \\{(x,y): x,y \in \\{1,2, \dots 6\\}\\}$, where $x$ is the first die and $y$ is the second:

| Second die \ First die | One | Two | Three | Four | Five | Six |
|---|---|---|---|---|---|---|
| One | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Two | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Three | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Four | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Five | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Six | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |

For any $\omega \in \Omega$, $P(\omega) = 1/36$.
What does the event that $y=3$ correspond to in the table?
What does the event that $x=2$ correspond to in the table?
What does the event that $x+y$ is *odd* correspond to in the table?
What does the event that $\color{orange}{x=2}$ OR $\color{orange}{y=3}$ correspond to in the table?
What does the event that $\color{orange}{x=2}$ AND $\color{orange}{y=3}$ correspond to in the table?
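Each of these questions can be checked by brute force. A small Python sketch (my own, not from the notes) that enumerates the 36 outcomes and computes each event's probability by counting cells of the table:

```python
# Two fair dice: every outcome (x, y) has probability 1/36.
from fractions import Fraction

omega = [(x, y) for x in range(1, 7) for y in range(1, 7)]

def P(event):
    """Probability of an event, given as a predicate on outcomes (x, y)."""
    return Fraction(sum(1 for x, y in omega if event(x, y)), len(omega))

print(P(lambda x, y: y == 3))             # one row of the table: 1/6
print(P(lambda x, y: x == 2))             # one column: 1/6
print(P(lambda x, y: (x + y) % 2 == 1))   # x + y odd: 18/36 = 1/2
print(P(lambda x, y: x == 2 or y == 3))   # a row plus a column: 11/36
print(P(lambda x, y: x == 2 and y == 3))  # a single cell: 1/36
```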
## Statistical independence **Definition:** we call two events $\color{green}{A}$ and $\color{green}{B}$ statistically **independent** when $$P(\color{green}{A} \textrm{ and } \color{green}{B}) = P(\color{green}{A})\cdot P(\color{green}{B})$$
Notation: you might see $P(A \textrm{ and } B)$ denoted as $P(A \cap B)$ or simply $P(A, B)$.
*Extension:* a collection of events $A_1 \dots A_n$ are independent if $$P(A_1 \cap A_2 \cap \dots \cap A_n) = \prod_{j=1}^n P(A_j)$$
## Conditional probabilities Let $\color{purple}{B}$ be an event such that $P(\color{purple}{B})>0$.
**Definition:** the **conditional probability** of $\color{green}{A}$ given $\color{purple}{B}$ is $$P(\color{green}{A}|\color{purple}{B}) = \frac{P(\color{green}{A} \textrm{ and } \color{purple}{B})}{P(\color{purple}{B})}$$ This definition is called *Bayes' Rule*. - Meaning: What is the probability of $A$ if we know that event $B$ occurs?
- In math: defines a *new* probability space in which $B$ is the whole sample space. $P(A|B)$ is the probability of event A in this new probability space. - If A and B are independent, then $P(A|B) = P(A)$ and $P(B|A) = P(B)$ - *Extension:* conditioning on B *and* C. $P(A|B \textrm{ and } C) = \frac{P(A \textrm{ and } B \textrm{ and } C)}{P(B \textrm{ and } C)}$.
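Continuing the two-dice sketch from above (again my own illustration, not from the notes): independence and conditional probabilities can both be verified by counting cells.

```python
from fractions import Fraction

omega = [(x, y) for x in range(1, 7) for y in range(1, 7)]

def P(event):
    return Fraction(sum(1 for x, y in omega if event(x, y)), len(omega))

def P_cond(A, B):
    """P(A | B) = P(A and B) / P(B)."""
    return P(lambda x, y: A(x, y) and B(x, y)) / P(B)

A = lambda x, y: x == 2
B = lambda x, y: y == 3

# Independence: P(A and B) = P(A) * P(B), i.e. 1/36 = 1/6 * 1/6
print(P(lambda x, y: A(x, y) and B(x, y)) == P(A) * P(B))   # True
print(P_cond(A, B))     # 1/6 = P(A), as expected under independence

C = lambda x, y: x + y == 5               # an event NOT independent of A
print(P_cond(A, C))     # 1/4 != P(A): knowing x+y=5 changes the odds of x=2
```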
Random variables
## Random variables Sets are nice, but numbers are way easier to deal with! **Definition**: Given a probability space $(\color{blue}{\Omega}, \color{gray}{F}, \color{orange}{P})$, a **random variable** is a function $X: \color{blue}{\Omega} \rightarrow \mathbb{R}$. Random variables associate a *number* to each outcome $\color{blue}{\omega}$ in the sample space $\color{blue}{\Omega}$. Examples: - We can let $\omega$ denote each side of a die, while $X(\omega)$ denotes the number written on side $\omega$. - We could let $X(\omega)$ be the height of a student in this class, where $\Omega$ represents the whole class. - Here you can think of $\omega$ as the identity of a given student, or equivalently a representation of "everything" about that student. We'll come back to this.
## Random variables induce a new probability $P_X$ Let $X$ be a random variable on probability space $(\color{blue}{\Omega}, \color{gray}{F}, \color{orange}{P})$. Let $\mathcal{X} \subseteq \mathbb{R}$ be the range of $X$ on $\Omega$. Then for any $A \subseteq \mathcal{X}$ define: $$P_X(X \in A) := \color{orange}{P}(\\{\color{blue}{\omega}: X(\color{blue}{\omega}) \in A\\})$$ Technically, this gives us a new probability space: $(\color{blue}{\mathbb{R}}, \color{gray}{\mathcal{B}}, \color{orange}{P_X})$, where $\mathcal{B}$ is the so-called "Borel" $\sigma$-algebra on the real numbers.
We'll typically denote individual elements of $\mathcal{X}$ by lower-case $x$. These correspond to singleton sets $A$, so: $P_X(X = x) := P(\\{\omega: X(\omega) =x\\})$.
We'll often let the $X$ in $P_X$ be implicit, e.g. $P(X \le 16)$ or $P(X=4)$.
Technically, there is an underlying probability space $(\color{blue}{\Omega}, \color{gray}{F}, \color{orange}{P})$ lurking in the background, but for practical purposes we can work with random variables directly.
## Example: years of education Suppose $X(\omega)$ indicates the number of years of schooling for various individuals $\omega$
## Distribution functions Suppose we have a random variable $X$ and $P=P_X$. So far we've defined $P(\cdot)$ as a function of *sets* $A$. Ugh. Is there a more succinct way to characterize $P$? Yes! We can do so through the *cumulative distribution function* (CDF). **Definition**: The **cumulative distribution function** of $X$ is defined as the following function: $$F(x) = P(X \le x) = P(X \in (-\infty, x])$$ - Not to be confused with $\color{gray}{F}$ from the definition of a probability space. - In cases where it may be ambiguous which random variable the CDF corresponds to, we might denote a CDF as $F_X$ instead of $F$ - For any $a \le b$: $P(a < X \le b) = F(b)-F(a)$ - CDFs are always weakly increasing, right-continuous functions taking values between zero and one, with limit $0$ at $-\infty$ and limit $1$ at $\infty$. Any function satisfying these properties is a valid CDF.
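As a quick illustration of the identity $P(a < X \le b) = F(b) - F(a)$, a sketch using the standard normal CDF from `scipy.stats` (assuming scipy is installed; not part of the notes):

```python
from scipy.stats import norm

F = norm.cdf                   # CDF of the standard normal distribution
a, b = -1.0, 1.0
print(F(b) - F(a))             # ~0.6827 = P(-1 < X <= 1)
print(F(-10), F(0), F(10))     # ~0, 0.5, ~1: weakly increasing, limits 0 and 1
```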
## CDF examples (Figure: a "typical" CDF for a continuous random variable with unbounded support.)
## An example from real-world data From Hansen (*Probability and Statistics for Economists*), page 30.
## CDF examples (Figure: a "typical" CDF for a discrete random variable.)
## CDF examples (Figure: random variables may be a mix of continuous and discrete.)
## Probability mass functions Suppose that $X$ takes $J$ possible values $x_1 < x_2 < \dots < x_J$.
Formally, this occurs when there is a discrete set $\mathcal{X}$ for which $P(X \in \mathcal{X})=1$. In this case we call $X$ a *discrete random variable*.
The set of values $x_1 < x_2 < \dots < x_J$ is referred to as the *support* of $X$, or $supp \\{X\\}$.
**Definition:** The **probability mass function** of $X$ is $\pi(x) = P(X=x)$
Alternative notation: $\pi_j = P(X=x_j)$
The p.m.f. can be derived from the CDF - $\pi(x) = \lim_{\epsilon \downarrow 0}P(x - \epsilon < X \le x) = \lim_{\epsilon \downarrow 0} F(x)-F(x-\epsilon)$. - $\pi_j = F(x_j) - F(x_{j-1})$
## Probability density functions Now suppose that $X$ takes all values within some interval $[a,b]$, and furthermore $F(x) = P(X \le x)$ is *differentiable* on $(a,b)$.
**Definition:** The **probability density function** $f(x)$ is the derivative of $F(x)$
Many common distributions have the property that the density $f(x)$ exists for all $x$, and we refer to them as *continuous random variables*. Recall the definition of the derivative $$ \scriptsize \frac{d}{dx} F(x) = \lim_{\epsilon \downarrow 0}\frac{F(x)-F(x-\epsilon)}{\epsilon}=\lim_{\epsilon \downarrow 0}\frac{P(x - \epsilon < X \le x)}{\epsilon}$$ Keep in mind: - $f(x)$ does not tell us the probability that $X=x$. Remember, for a continuous random variable: $P(X=x)=0$. - Rather, for a small $\epsilon>0$: $P(X \in [x,x+\epsilon]) \approx f(x)\cdot \epsilon$
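A quick numerical check of the approximation $P(X \in [x, x+\epsilon]) \approx f(x)\cdot \epsilon$, again using `scipy.stats` (my own sketch, not from the notes):

```python
from scipy.stats import norm

x, eps = 0.5, 1e-4
exact = norm.cdf(x + eps) - norm.cdf(x)   # P(x < X <= x + eps)
approx = norm.pdf(x) * eps                # f(x) * eps
print(exact, approx)                      # agree to several decimal places
```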
## Examples of density functions (figures omitted)
## What about r.v.'s that are neither continuous nor discrete?
A generic CDF can be decomposed: $F(x) = p\cdot F_{cont}(x)+(1-p)\cdot F_{discr}(x)$
Where $p \in [0,1]$, $F_{cont}$ admits a density function (i.e. is differentiable) and $F_{discr}$ admits a probability mass function.
The interpretation is that we can view $X$ as a so-called "mixture" of a continuous random variable and a discrete one.
## A few properties of random variables - If $X$ is a random variable, then so is $Y=g(X)$ for any "measurable" function $g(x)$, e.g. $X+1$ or $2X$ or $X^2$ - The probability space for $Y$ is still of the form $(\mathbb{R}, \mathcal{B}, P)$; only the probability measure $P$ has changed. - We can characterize the $P$ for $Y$ through its CDF: $F_{Y}(y) = P(g(X) \le y)$ - If $X$ has discrete support $x_1, x_2 \dots$ with probabilities $\pi_1, \pi_2 \dots$, then $g(X)$ has discrete support $g(x_1), g(x_2), \dots$ with the same probabilities $\pi_1, \pi_2, \dots$ - More generally, if $X$ and $Y$ are random variables, then so is $g(X, Y)$ - e.g. $X + Y$ or $X \cdot Y$ or $\min\\{X, Y\\}$ - However, the distribution of e.g. $X + Y$ depends on the full joint distribution of $(X, Y)$ (we'll come back to this) - A non-stochastic variable $x$ can be viewed as a "degenerate" random variable, meaning it puts all of its probability mass at a single point: $P(X=x) = 1$
Expectation
## Expectation: general definition Consider a random variable $X$ with CDF $F(x)$. The following is a general definition of the expectation operator that allows for $X$ to be discrete, continuous, or mixed. **Definition:** The expectation of $X$ is: $$\scriptsize \begin{align} \mathbb{E}[X] &= \int_{-\infty}^\infty x\cdot dF(x)\\\\ &:= \lim_{a \rightarrow -\infty, b \rightarrow \infty} \color{green}{\lim_{N \rightarrow \infty}} \sum_{n=1}^{N} \color{red}{\left\\{a+n\cdot \frac{b-a}{N}\right\\}}\cdot \left\\{\color{blue}{F\left(a+n\cdot \frac{b-a}{N}\right)-F\left(a+(n-1)\cdot \frac{b-a}{N}\right)}\right\\} \end{align}$$ For given values of $a,b,N$, imagine cutting the interval $[a,b]$ into $N$ regions of size $\frac{b-a}{N}$. The $n^{th}$ such region extends from $a+(n-1)\cdot \frac{b-a}{N}$ to $a+n\cdot \frac{b-a}{N}$. - $\color{blue}{F\left(a+n\cdot \frac{b-a}{N}\right)-F\left(a+(n-1)\cdot \frac{b-a}{N}\right)}$ yields $P(X \in \textrm{ region }n)$. - $\color{red}{\left\\{a+n\cdot \frac{b-a}{N}\right\\}}$ is the location of (the right end of) region $n$. - $\color{green}{\lim_{N \rightarrow \infty}}$ takes the sum to an integral, and the $a,b$ limit covers full support of $X$.
## Expectation: discrete case We'll never need that general definition in this class: it takes simpler forms when $X$ is either discrete or continuous. Suppose $X$ takes discrete values $x_1 < x_2 < x_3 \dots $ with associated probability mass function $\pi_j$.
$$\mathbb{E}[X] = \int_{-\infty}^\infty x\cdot dF(x) = \sum_{j} x_j \cdot \pi_j$$ To derive this from the general definition on the last slide, notice that for large enough $N$, only one $x_j$ can be between $a+\frac{n-1}{N}(b-a)$ and $a+\frac{n}{N}(b-a)$. Thus: $$\small \left\\{\color{blue}{F\left(a+\frac{n}{N}(b-a)\right)-F\left(a+\frac{n-1}{N}(b-a)\right)}\right\\} = \pi_j$$ if $x_j$ is between $a+\frac{n-1}{N}(b-a)$ and $a+\frac{n}{N}(b-a)$, and the term in brackets is zero if no $x_j$ is between them.
## Expectation: continuous case Suppose $F(x)$ is differentiable everywhere. Then: $$\mathbb{E}[X] = \int_{-\infty}^\infty x\cdot dF(x) = \int_{-\infty}^\infty x\cdot f(x)\cdot dx$$ To derive this from the general definition, notice that for large $N$: $$\small \left\\{\color{blue}{F\left(a+\frac{n}{N}(b-a)\right)-F\left(a+\frac{n-1}{N}(b-a)\right)}\right\\} \approx f\left(a+\frac{n}{N}(b-a)\right) \cdot \frac{b-a}{N}$$ where $f(x)$ is the pdf of $X$. This comes from the Taylor expansion $F(x+\epsilon) \approx F(x) + F'(x)\cdot \epsilon$. Press down-arrow for an example where $X$ has a mixed discrete-continuous distribution.
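The general definition is also easy to implement directly. Here is a sketch (mine, not from the notes) that approximates $\mathbb{E}[X] = \int x\, dF(x)$ by the Riemann-Stieltjes sum above, for fixed large $a$, $b$, and $N$:

```python
import numpy as np
from scipy.stats import norm

def expectation_from_cdf(F, a=-10.0, b=10.0, N=100_000):
    """Approximate E[X] = int x dF(x) via sum_n x_n * (F(x_n) - F(x_{n-1}))."""
    grid = np.linspace(a, b, N + 1)             # a = x_0 < x_1 < ... < x_N = b
    return np.sum(grid[1:] * np.diff(F(grid)))  # right endpoints times increments

# For a normal distribution with mean 1.5, the sum should be close to 1.5:
print(expectation_from_cdf(lambda x: norm.cdf(x, loc=1.5, scale=2.0)))
```

The same function works for discrete or mixed CDFs, since it only ever evaluates $F$ itself.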
## Expectation for a mixed distribution
Suppose that $$F(x) = p \cdot F_c(x) + (1-p) \cdot F_d(x),$$ where $F_c(x)$ is a differentiable CDF with density $f(x)$, and $F_d(x)$ is a discrete CDF with associated probability mass function $\pi_j$ for support points $x_j$.
Note that the definition of $\mathbb{E}[X]$ is *linear* in the CDF $F(x)$.
This implies that the expectation is equal to $p$ times an expectation according to $F_c$, plus $1-p$ times an expectation according to $F_d$: $$\mathbb{E}[X] = \int_{-\infty}^\infty x\cdot dF(x) = \color{orange}{p} \cdot \int_{-\infty}^\infty x\cdot f(x)\cdot dx + \color{orange}{(1-p)}\cdot \sum_{j} x_j \cdot \pi_j$$
## Expectation is awesome - Intuitively, expectation measures the "average" value of a random variable $X$, where we weight each possible value according to the distribution of $X$. - We'll see later in this course that this principle has an empirical meaning: if we have a large number of realizations of a random variable $X$, their average will be close to $\mathbb{E}[X]$ with very high probability. - $\mathbb{E}[X]$ is also our "best guess" of the value of $X$ in the sense that $$ \mathbb{E}[X] = \textrm{argmin}_a \mathbb{E}[(X-a)^2]$$ (see the sketch below) - An important and useful property of expectation is that it is *linear*: - $\mathbb{E}[a \cdot X] = a \cdot \mathbb{E}[X]$ for any constant $a \in \mathbb{R}$ - $\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]$ for any two random variables $X$ and $Y$ - Warning: $\mathbb{E}[X]$ does not always exist as a finite number!
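Here is a quick numerical check of the "best guess" property (a sketch on a made-up discrete distribution, not from the notes):

```python
import numpy as np

x = np.array([0.0, 1.0, 4.0])      # support points (hypothetical example)
pi = np.array([0.5, 0.3, 0.2])     # probability mass function (sums to one)
EX = np.sum(x * pi)                # E[X] = 1.1

grid = np.linspace(-2.0, 5.0, 7001)                      # candidate values of a
mse = np.array([np.sum(pi * (x - a)**2) for a in grid])  # E[(X - a)^2] for each a
print(EX, grid[np.argmin(mse)])    # both ~1.1: E[X] minimizes E[(X-a)^2]
```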
Conditional distributions and expectations
## Conditional expectations: motivation - In econometrics, we usually have data that records several variables e.g. $(X, Y, Z, \dots)$ for each observational unit $i$ - We're interested in the relationships between these variables - The concept of the *conditional expectation* will play a fundamental role in how we study these relationships - Intuitively: $\mathbb{E}[Y|X=x]$ answers the following question: what is the average value of $Y$ when we know that $X=x$? - When the answer to this question depends on the particular value of $x$ we've chosen, we conclude that there is a statistical relationship between $X$ and $Y$
## Conditional probabilities for a discrete random variable - Recall that for events $A$ and $B$, the conditional probability of $A$ given $B$ is defined by Bayes' Rule: $$P(A|B) = P(A\textrm{ and }B)/P(B)$$ - The idea of a conditional distribution applies this idea to random variables. Provided that $P(X=x) > 0$: $$ P(Y=y|X=x)=\frac{P(Y = y \textrm{ and } X=x)}{P(X=x)}$$ - We can also define a conditional CDF for $Y$: $$ P(Y\le y|X=x)=\frac{P(Y \le y \textrm{ and } X=x)}{P(X=x)}$$
## Conditional distributions: general definition What if $X$ is a continuous random variable? The following definition works regardless of whether $X$ and/or $Y$ are discrete, continuous, or mixed: **Definition:** the **conditional CDF** of $Y$ given $X=x$ is $$P(Y \le y|X=x) :=\lim_{\epsilon \downarrow 0} P(Y \le y|x \le X \le x + \epsilon)$$ Alternative notations: we'll often write $P(Y \le y|X=x)$ as $F_{Y|X}(y|x)$ or sometimes $F_{Y|X=x}(y)$. Scroll down to read about how the quantity $P(Y \le y|X=x)$ can be interpreted in terms of the *joint distribution* of $X$ and $Y$.
## Joint distribution functions Recall the CDF function $F_X(x) = P(X \le x)$ for a r.v. $X$.
**Definition:** the **joint CDF** of two random variables $X$ and $Y$ is defined as $$F_{XY}(x,y) = P(X \le x, Y \le y)$$ where $(X \le x, Y \le y)$ is understood as the event that $X \le x$ *and* $Y \le y$. A couple of notes - this "and" statement is well-defined because $X$ and $Y$ share a common probability space (recall the two-dice example) - when the cross derivative $\frac{\partial^2}{\partial x \partial y}F_{XY}(x,y)$ exists, it defines a *joint density* $f_{XY}(x,y)$ for $X$ and $Y$ (Press down arrow for some visual examples)
Example: let $X$ and $Y$ be two uniform $[0,1]$ random variables that are independent. Then $$F_{XY}(x,y) = F_X(x)\cdot F_Y(y) = x\cdot y$$
Source: https://academo.org/demos/3d-surface-plotter/?expression=x*y&xRange=0%2C1&yRange=0%2C1&resolution=25
Example: let $X$ and $Y$ be two "logistic" random variables that are independent. Then $$F_{XY}(x,y) = F(x)\cdot F(y) \textrm{ where } F(t) = \frac{1}{1+e^{-t}}$$
Source: https://academo.org/demos/3d-surface-plotter/?expression=1%2F((1%2Be%5E(-x))*(1%2Be%5E(-y)))&xRange=-5%2C5&yRange=-5%2C5&resolution=25
In the last example, the joint PDF is $$f_{XY}(x,y) = \frac{d}{dx}F(x)\cdot \frac{d}{dy}F(y) \textrm{ where } \frac{d}{dt}F(t) = \frac{e^{-t}}{(1+e^{-t})^2}$$
Source: https://academo.org/demos/3d-surface-plotter/?expression=e%5E(-x)%2F(1%2Be%5E(-x))%5E2*e%5E(-y)%2F(1%2Be%5E(-y))%5E2&xRange=-5%2C5&yRange=-5%2C5&resolution=25
## Joint distributions: continued A property of joint distributions that gets used a lot is the relationship between the joint distribution of $X$ and $Y$ and their so-called "marginal distributions". For example: - $F_X(x) = F_{XY}(x,\infty) = P(X \le x, Y \le \infty) = P(X \le x)$ - Similarly $F_Y(y)=F_{XY}(\infty,y)$ - If $X$ and $Y$ are both discrete: $P(X=x) = \sum_{j} P(X=x\textrm{ and } Y=y_j)$ where $y_j$ are the support points of $Y$ (down for example) - If $X$ and $Y$ are both continuously distributed: $f_X(x) = \int_{-\infty}^\infty f_{XY}(x,y) dy$ These rules follow from the *law of total probability*, which says that for any countable collection of events $A_1, A_2, \dots$ that partition the sample space (that is, $\bigcup_j A_j = \Omega$ and the $A_j$ are disjoint), then $P(B) = \sum_j P(B \cap A_j)$ for any event $B$ (press down arrow for a proof).
## Example: marginal distribution in the two-dice setting
| Second die \ First die | One | Two | Three | Four | Five | Six |
|---|---|---|---|---|---|---|
| One | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Two | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Three | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Four | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Five | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| Six | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
$P(Y=3) = \sum_{j=1}^6 \color{red}{P(X = j \textrm{ and } Y=3)}$, summing across third row. Similarly, for $P(X=j)$ we would simply sum down column $j$, across values of $Y$.
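In code (a sketch of my own, not from the notes): store the joint p.m.f. as a matrix and marginalize by summing along one dimension.

```python
import numpy as np

# Joint pmf for two fair dice: P(X=j, Y=k) = 1/36; rows index Y, columns index X.
joint = np.full((6, 6), 1 / 36)

P_Y = joint.sum(axis=1)      # sum across each row: the marginal pmf of Y
P_X = joint.sum(axis=0)      # sum down each column: the marginal pmf of X
print(P_Y[2])                # P(Y=3) = 6 * (1/36) = 1/6 (index 2 is the value 3)
print(P_X)                   # each entry 1/6
```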
## Proof of the law of total probability Since any event $B$ satisfies $B \subseteq \Omega$, we have $B = B \cap \Omega$ and thus $P(B) = P(B \cap \Omega)$. Now, since $\bigcup_j A_j = \Omega$, we have that $P(B) = P\left(B \cap \bigcup_j A_j\right)$. Observe that $B \cap \bigcup_j A_j = \bigcup_j (B \cap A_j)$, and that the events $(B \cap A_j)$ are disjoint. Thus, by countable additivity, $P(B) = \sum_j P(B \cap A_j)$, proving the result.
## Deriving the discrete $X$ case from the general definition When $P(X=x)>0$, then using the definition of conditional probability, and the quotient rule for limits: $$ \begin{align}F_{Y|X}(y|x)&=\lim_{\epsilon \downarrow 0} \frac{P(Y \le y \textrm{ and } x \le X \le x + \epsilon)}{P(x \le X \le x + \epsilon)}\\\\ &=\frac{\lim_{\epsilon \downarrow 0} P(Y \le y \textrm{ and } x \le X \le x + \epsilon)}{\lim_{\epsilon \downarrow 0} P(x \le X \le x + \epsilon)}\\\\ &=\color{blue}{\frac{P(Y \le y \textrm{ and } X=x)}{P(X=x)}} \end{align}$$ Interpretation: $F_{Y|X}(y|x)$ is the CDF of $Y$ among the sub-population for which $X=x$.
## Conditional distribution: continuous $X$ When $X$ is continuously distributed, $P(X=x)=0$ so we have to use something akin to L'Hôpital's rule to evaluate the limit: $$ \begin{align}F_{Y|X}(y|x)&=\lim_{\epsilon \downarrow 0} \frac{P(Y \le y \textrm{ and } x \le X \le x + \epsilon)}{P(x \le X \le x + \epsilon)}\\\\ &=\frac{\lim_{\epsilon \downarrow 0} P(Y \le y \textrm{ and } x \le X \le x + \epsilon)/\color{red}{\epsilon}}{\lim_{\epsilon \downarrow 0} P(x \le X \le x + \epsilon)/\color{red}{\epsilon}}\\\\ &=\color{blue}{\frac{\frac{d}{dx} P(Y \le y \textrm{ and } X \le x)}{f_X(x)}} \end{align}$$ Provided that $f_X(x)>0$ and $\frac{d}{dx} F_{XY}(x,y)$ exists. Interpretation: $F_{Y|X}(y|x)$ is the CDF of $Y$ among the sub-population for which $X$ is "very close" to $x$.
## Conditional distributions: summary For a fixed $x$, we can verify that $F_{Y|X}(y|x)$ satisfies the properties of a univariate CDF for $Y$ (the notation $F_{Y|X=x}(y)$ is nice here). That is, - $F_{Y|X=x}(y)$ is weakly increasing in $y$ - $\lim_{y \uparrow \infty} F_{Y|X=x}(y) = 1$ - $\lim_{y \downarrow -\infty} F_{Y|X=x}(y) = 0$ - $F_{Y|X=x}(y)$ is continuous from the right To define $\mathbb{E}[Y|X=x]$, we simply treat $F_{Y|X=x}(y)$ as a distribution for $Y$.
## Conditional expectation function **Definition**: the **conditional expectation function** (CEF) of $Y$ given $X$ is: $$ \mathbb{E}[Y|X=x] = \int_{- \infty}^\infty y \cdot dF_{Y|X=x}(y) $$ viewed as a function of $x$. Again, we can rewrite this depending on what type of random variable $Y$ is: - When $Y$ is continuous: $$\small \color{blue}{\mathbb{E}[Y|X=x]=\int_{- \infty}^\infty y \cdot f_{Y|X=x}(y)\cdot dy} \quad \textrm{ where } \quad f_{Y|X=x}(y) = \frac{d}{dy}F_{Y|X=x}(y)$$ - When $Y$ is discrete: $$\small \color{blue}{\mathbb{E}[Y|X=x] = \sum_{j} y_j \cdot \pi_{j|X=x}} \quad \textrm{ where } \quad \pi_{j|X=x}= \lim_{\epsilon \downarrow 0} \left\\{F_{Y|X=x}(y_j)-F_{Y|X=x}(y_j-\epsilon)\right\\}$$
## Conditional expectation as a random variable instead of a function Let $m(x):= \mathbb{E}[Y|X=x]$ be the CEF of $Y$ on $X$. Note that $m: supp\\{X\\} \rightarrow \mathbb{R}$ is a function, not a random variable. But we can use $m$ to define a new random variable $m(X)$, which we denote $\mathbb{E}[Y|X]$. - For example, if $X$ is discrete, then $\mathbb{E}[Y|X]$ takes value $m(x_j) = \mathbb{E}[Y|X=x_j]$ with probability $\pi_j$. Now we can state a very useful result that gets used over and over again in statistics, the so-called *law of iterated expectations*.
## The law of iterated expectations **Proposition** (the law of iterated expectations) $\mathbb{E}[Y] = \mathbb{E}\left[\mathbb{E}[Y|X]\right]$ We'll prove the law of iterated expectations (LIE) for the case of a continuous $X$ and $Y$. The other cases are analogous. Starting from the RHS: $$ \small \require{cancel} \begin{align} \mathbb{E}\left[\mathbb{E}[Y|X]\right] &= \int_{x \in \mathbb{R}: f_X(x)>0} f_X(x) \cdot \mathbb{E}[Y|X=x]\cdot dx \\\\ &= \int_{x \in \mathbb{R}: f_X(x)>0} f_X(x) \cdot \left\\{ \int_{y \in \mathbb{R}} y \cdot f_{Y|X}(y|x) \cdot dy\right\\}\cdot dx \\\\ &= \int_{x \in \mathbb{R}: f_X(x)>0} \cancel{f_X(x)} \cdot \left\\{ \int_{y \in \mathbb{R}} y \cdot \frac{f_{XY}(x,y)}{\cancel{f_X(x)}} \cdot dy\right\\}\cdot dx\\\\ &= \int_{y \in \mathbb{R}} y \cdot \underbrace{\left\\{\int_{x \in \mathbb{R}: f_X(x)>0} f_{XY}(x,y) \cdot dx \right\\}}\_{=f_{Y}(y)} \cdot dy = \int_{y \in \mathbb{R}} y \cdot f_{Y}(y) \cdot dy = \mathbb{E}[Y] \end{align}$$
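The LIE is also easy to verify numerically for a discrete joint distribution. A sketch (with a made-up joint p.m.f., not from the notes):

```python
import numpy as np

# Hypothetical joint pmf P(X=x, Y=y): rows index x in {0,1}, columns y in {1,2,3}.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])   # entries sum to one
ys = np.array([1.0, 2.0, 3.0])

P_X = joint.sum(axis=1)           # marginal pmf of X
m = (joint @ ys) / P_X            # CEF: m(x) = E[Y | X=x]
lhs = joint.sum(axis=0) @ ys      # E[Y], computed from the marginal of Y
rhs = P_X @ m                     # E[E[Y|X]], weighting m(x) by P(X=x)
print(lhs, rhs)                   # both 1.95
```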
## Variance **Definition:** The variance $Var(X)$ of a random variable $X$ is defined as $$\small Var(X) = E[(X-E[X])^2]$$ - Variance is always weakly positive (why?), and is zero for a degenerate random variable - A very useful identity is $\small Var(X) = E[X^2] - (E[X])^2$. - Proof: $E[(X-E[X])^2] = E[X^2 - 2XE[X] + (E[X])^2] = E[X^2] - 2E[X]E[X] + (E[X])^2 = E[X^2] - (E[X])^2$ - The conditional variance $Var(Y|X=x)$ is defined as the variance of the conditional distribution, i.e. $$\small Var(Y|X=x) = E[(Y-E[Y|X=x])^2|X=x]$$ - The "law of total variance": $\small Var(Y) = E[Var(Y|X)] + Var(E[Y|X])$
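Both the variance identity and the law of total variance can be checked on simulated data. A sketch (my own setup, not from the notes), where the in-sample decomposition holds exactly up to rounding:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=100_000)      # X ~ Bernoulli(1/2)
Y = 2.0 * X + rng.normal(size=X.size)     # Y | X=x ~ N(2x, 1), so Var(Y) = 2

# The identity Var(Y) = E[Y^2] - (E[Y])^2:
print(Y.var(), (Y**2).mean() - Y.mean()**2)    # equal (up to rounding)

# Law of total variance: Var(Y) = E[Var(Y|X)] + Var(E[Y|X])
p = np.array([(X == x).mean() for x in (0, 1)])           # P(X=x), in-sample
cond_var = np.array([Y[X == x].var() for x in (0, 1)])    # Var(Y | X=x)
cond_mean = np.array([Y[X == x].mean() for x in (0, 1)])  # E[Y | X=x]
within = p @ cond_var                                     # E[Var(Y|X)]
between = p @ ((cond_mean - p @ cond_mean) ** 2)          # Var(E[Y|X])
print(Y.var(), within + between)                          # equal (up to rounding)
```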
Random vectors
## Random vectors Rather than coming up with new letters $X, Y, Z$ for multiple random variables, sometimes it is more convenient to think of "random vectors" **Definition:** A random vector $X$ is a vector in which each component is a random variable, e.g. $$ \small X = \begin{pmatrix}X_{1} \\\ X_{2} \\\ \vdots \\\ X_{k}\end{pmatrix}$$ where $X_{1}$, $X_{2}$, etc. are each random variables. A realization $\mathbf{x}$ of $X$ is a point in $\mathbb{R}^k$, i.e. $\mathbf{x} = (x_1, x_2, \dots, x_k)'$. When we have a random vector $X$, we can define a *joint CDF* $F_X$ to characterize its distribution: $ F_X(\mathbf{x}) = P(X_{1} \le x_1 \textrm{ and } X_{2} \le x_2 \textrm{ and } \dots \textrm{ and } X_{k} \le x_k)$
## Expectations of a random vector Given a random vector $X$, we can define its expectation as a vector composed of the expectation of each of its components: $$ E[X] = E\left[\begin{pmatrix}X_{1} \\\ X_{2} \\\ \vdots \\\ X_{k}\end{pmatrix}\right] := \begin{pmatrix}E[X_{1}] \\\ E[X_{2}] \\\ \vdots \\\ E[X_{k}]\end{pmatrix}$$
We will also define and make use of the *variance* $Var(X)$ for a random vector, but will introduce this later in the course after reviewing more matrix algebra.
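A one-line illustration with simulated draws (a sketch assuming numpy; the distribution is made up): the sample analogue of $E[X]$ is just the componentwise mean.

```python
import numpy as np

rng = np.random.default_rng(0)
# One million draws of a 3-dimensional random vector X; rows are realizations.
draws = rng.multivariate_normal(mean=[1.0, -2.0, 0.5], cov=np.eye(3),
                                size=1_000_000)
print(draws.mean(axis=0))   # ~ (1.0, -2.0, 0.5)' = (E[X_1], E[X_2], E[X_3])'
```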
## Conditional expectations with random vectors Using the definition of conditional probability for several events (e.g. $P(A|B \textrm{ and } C) = \frac{P(A\textrm{ and } B \textrm{ and } C)}{P(B \textrm{ and } C)}$), we can define: - The conditional CDF $F_{Y|X}(y|\mathbf{x})$ when $X$ is a random vector - $\small F_{Y|X}(y|\mathbf{x}) = \lim_{\epsilon_1, \epsilon_2, \dots \downarrow 0}P(Y \le y|X_{1} \in [x_1, x_1 + \epsilon_1], X_{2} \in [x_2, x_2 + \epsilon_2] \dots )$ - The CEF $\mathbb{E}[Y|X = \mathbf{x}]$ when $X$ is a random vector, using $F_{Y|X}(y|\mathbf{x})$ Note that - The LIE still holds: $\mathbb{E}[Y] = \mathbb{E}\left[\mathbb{E}[Y|X]\right]$ when $X$ is a random vector - Separating $Y$ and $X$ here is just notation. We can also define quantities like $\mathbb{E}[X_{k}|X_{1} = x_1,X_{2}=x_2, \dots X_{k-1}=x_{k-1}]$ for some vector $\mathbf{x} = (x_1, x_2, \dots, x_{k-1})'$ in $\mathbb{R}^{k-1}$